Re: Hash index todo list item - Mailing list pgsql-hackers

From Jens-Wolfhard Schicke
Subject Re: Hash index todo list item
Date
Msg-id 6CD3179DA44575D994F805D7@[192.168.1.85]
In response to Re: Hash index todo list item  (Mark Mielke <mark@mark.mielke.cc>)
List pgsql-hackers
--On Saturday, September 08, 2007 18:56:23 -0400 Mark Mielke 
<mark@mark.mielke.cc> wrote:
> Kenneth Marshall wrote:
> Along with the hypothetical performance
> wins, the hash index space efficiency would be improved by a similar
> factor. Obviously, all of these ideas would need to be tested in
> various workload environments. In the large index arena, 10^6 to 10^9
> keys and more, space efficiency will help keep the index manageable
> in today's system memories.
>
>
> Space efficiency is provided by not storing the key, nor the header data
> required (length prefix?).
Space efficiency at ~1 entry per bucket: How about using closed hashing, 
storing at the front of each page a bitmask which specifies which slots 
hold valid entries, and in the rest of the page row pointers (is this the 
correct expression? I don't know...) without further data? That should 
give a reasonably simple data structure and good alignment for the pointers.
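
To make the layout concrete, here is a rough sketch in Python (for illustration only, not backend code). The 8192-byte page, the 6-byte row pointer (4-byte block number plus 2-byte offset), the class name HashPage and the linear probing inside a single page are all my assumptions; deletion and overflow to other pages are left out.

```python
import struct

PAGE_SIZE = 8192   # assumed page size in bytes
TID_SIZE = 6       # assumed row pointer: 4-byte block number + 2-byte offset


def slots_per_page(page_size=PAGE_SIZE, tid_size=TID_SIZE):
    """Largest n such that ceil(n / 8) bitmask bytes plus n pointers fit."""
    n = (page_size * 8) // (tid_size * 8 + 1)
    while (n + 7) // 8 + n * tid_size > page_size:
        n -= 1
    return n


class HashPage:
    """Hypothetical page: occupancy bitmask in front, bare row pointers after."""

    def __init__(self):
        self.nslots = slots_per_page()
        self.mask_bytes = (self.nslots + 7) // 8
        self.buf = bytearray(PAGE_SIZE)

    def _is_used(self, slot):
        return bool(self.buf[slot >> 3] & (1 << (slot & 7)))

    def _set_used(self, slot):
        self.buf[slot >> 3] |= 1 << (slot & 7)

    def _offset(self, slot):
        return self.mask_bytes + slot * TID_SIZE

    def insert(self, hash_value, block, offset):
        """Closed hashing with linear probing; returns the slot or None if full."""
        start = hash_value % self.nslots
        for i in range(self.nslots):
            slot = (start + i) % self.nslots
            if not self._is_used(slot):
                struct.pack_into("<IH", self.buf, self._offset(slot), block, offset)
                self._set_used(slot)
                return slot
        return None

    def lookup(self, hash_value):
        """Yield candidate row pointers; no key is stored, so the caller
        still has to recheck the key (and visibility) against the heap."""
        start = hash_value % self.nslots
        for i in range(self.nslots):
            slot = (start + i) % self.nslots
            if not self._is_used(slot):
                return
            yield struct.unpack_from("<IH", self.buf, self._offset(slot))
```

Under these assumptions the bitmask costs one bit per slot, and because the mask length and the pointer size are both even, every pointer starts on a 2-byte boundary.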

> Please keep the ideas and comments coming. I am certain that a synthesis
> of them will provide an implementation with the performance
> characteristics
> that we are seeking.

One should look into new plan nodes for "!= ANY()", "NOT EXISTS" and 
similar. A node like "look into the hash and return true if the bucket is 
empty" would need no tuple visibility check when the bucket is empty, and 
could be a win in some situations.
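
A rough sketch of what such a node could do, again in illustrative Python. The function name not_exists_probe, the list-of-lists bucket representation and the is_visible_match callback (standing in for the heap fetch plus visibility and key recheck) are hypothetical; the sketch also assumes the index has an entry for every heap row.

```python
def not_exists_probe(buckets, hash_fn, key, is_visible_match):
    """Return True if no visible row matches key.

    buckets          -- list of buckets, each a list of row pointers
    hash_fn          -- maps a key to an integer hash
    is_visible_match -- hypothetical callback(row_pointer, key) that fetches
                        the heap tuple and rechecks key and visibility
    """
    bucket = buckets[hash_fn(key) % len(buckets)]
    if not bucket:
        # Fast path: an empty bucket means no row with this hash exists
        # at all, so no heap access or visibility check is needed.
        return True
    # Slow path: some entry hashes here; each one must be rechecked.
    return not any(is_visible_match(tid, key) for tid in bucket)
```

The win, if any, would come only from the fast path; a non-empty bucket costs the same heap visits as an ordinary probe.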

Do we want special cases for short keys like INT4? In those cases the 
implementation might use hash == key and put that knowledge to use in 
plans. Even a unique constraint might then be doable. Does the 
PostgreSQL storage backend on Linux support sparse files? That might be a 
win when holes in the key sequence turn up.
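
To illustrate the hash == key idea, a small Python sketch. The names int4_identity_hash and UniqueInt4Index are made up for this example, the dict merely stands in for the on-disk buckets, and tuple visibility is ignored entirely. Because the hash is injective on INT4, two entries with the same hash necessarily have the same key, so a duplicate can be detected from the index alone.

```python
def int4_identity_hash(key):
    """Map a signed 32-bit integer to its unsigned 32-bit bit pattern."""
    if not -2**31 <= key < 2**31:
        raise ValueError("key does not fit in INT4")
    return key & 0xFFFFFFFF


class UniqueInt4Index:
    """Hypothetical unique index: the hash is the key, so no key is stored."""

    def __init__(self):
        self.entries = {}  # hash value -> row pointer

    def insert(self, key, row_pointer):
        h = int4_identity_hash(key)
        if h in self.entries:
            # Same hash implies same key, so this is a genuine duplicate,
            # not a mere hash collision.
            raise ValueError("duplicate key %d" % key)
        self.entries[h] = row_pointer

    def lookup(self, key):
        return self.entries.get(int4_identity_hash(key))
```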